{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Intro to Hidden Markov Models (optional)\n",
    "---\n",
    "### Introduction\n",
    "\n",
    "In this notebook, you'll use the [Pomegranate](http://pomegranate.readthedocs.io/en/latest/index.html) library to build a simple Hidden Markov Model and explore the Pomegranate API.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Note:** You are not required to complete this notebook and it will not be submitted with your project, but it is designed to quickly introduce the relevant parts of the Pomegranate library that you will need to complete the part of speech tagger.\n",
    "</div>\n",
    "\n",
    "The notebook already contains some code to get you started. You only need to add some new functionality in the areas indicated; you will not need to modify the included code beyond what is requested. Sections that begin with **'IMPLEMENTATION'** in the header indicate that you need to fill in code in the block that follows. Instructions will be provided for each section, and the specifics of the implementation are marked in the code block with a 'TODO' statement. Please be sure to read the instructions carefully!\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Note:** Code and Markdown cells can be executed using the `Shift + Enter` keyboard shortcut. Markdown cells can be edited by double-clicking the cell to enter edit mode.\n",
    "</div>\n",
    "<hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-warning\">\n",
    "**Note:** Make sure you have selected a **Python 3** kernel in Workspaces or the hmm-tagger conda environment if you are running the Jupyter server on your own machine.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Jupyter \"magic methods\" -- only need to be run once per kernel restart\n",
    "%load_ext autoreload\n",
    "%aimport helpers\n",
    "%autoreload 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import python modules -- this cell needs to be run again if you make changes to any of the files\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "from helpers import show_model\n",
    "from pomegranate import State, HiddenMarkovModel, DiscreteDistribution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build a Simple HMM\n",
    "---\n",
    "You will start by building a simple HMM network based on an example from the textbook [Artificial Intelligence: A Modern Approach](http://aima.cs.berkeley.edu/).\n",
    "\n",
    "> You are the security guard stationed at a secret under-ground installation. Each day, you try to guess whether it’s raining today, but your only access to the outside world occurs each morning when you see the director coming in with, or without, an umbrella.\n",
    "\n",
    "A simplified diagram of the required network topology is shown below.\n",
    "\n",
    "![](_example.png)\n",
    "\n",
    "### Describing the Network\n",
    "\n",
    "<div class=\"alert alert-block alert-warning\">\n",
    "$\\lambda = (A, B)$ specifies a Hidden Markov Model in terms of an emission probability distribution $A$ and a state transition probability distribution $B$.\n",
    "</div>\n",
    "\n",
    "HMM networks are parameterized by two distributions: the emission probabilties giving the conditional probability of observing evidence values for each hidden state, and the transition probabilities giving the conditional probability of moving between states during the sequence. Additionally, you can specify an initial distribution describing the probability of a sequence starting in each state.\n",
    "\n",
    "<div class=\"alert alert-block alert-warning\">\n",
    "At each time $t$, $X_t$ represents the hidden state, and $Y_t$ represents an observation at that time.\n",
    "</div>\n",
    "\n",
    "In this problem, $t$ corresponds to each day of the week and the hidden state represent the weather outside (whether it is Rainy or Sunny) and observations record whether the security guard sees the director carrying an umbrella or not.\n",
    "\n",
    "For example, during some particular week the guard may observe an umbrella ['yes', 'no', 'yes', 'no', 'yes'] on Monday-Friday, while the weather outside is ['Rainy', 'Sunny', 'Sunny', 'Sunny', 'Rainy']. In that case, $t=Wednesday$, $Y_{Wednesday}=yes$, and $X_{Wednesday}=Sunny$. (It might be surprising that the guard would observe an umbrella on a sunny day, but it is possible under this type of model.)\n",
    "\n",
    "### Initializing an HMM Network with Pomegranate\n",
    "The Pomegranate library supports [two initialization methods](http://pomegranate.readthedocs.io/en/latest/HiddenMarkovModel.html#initialization). You can either explicitly provide the three distributions, or you can build the network line-by-line. We'll use the line-by-line method for the example network, but you're free to use either method for the part of speech tagger."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# create the HMM model\n",
    "model = HiddenMarkovModel(name=\"Example Model\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **IMPLEMENTATION**: Add the Hidden States\n",
    "When the HMM model is specified line-by-line, the object starts as an empty container. The first step is to name each state and attach an emission distribution.\n",
    "\n",
    "#### Observation Emission Probabilities: $P(Y_t | X_t)$\n",
    "We need to assume that we have some prior knowledge (possibly from a data set) about the director's behavior to estimate the emission probabilities for each hidden state. In real problems you can often estimate the emission probabilities empirically, which is what we'll do for the part of speech tagger. Our imaginary data will produce the conditional probability table below. (Note that the rows sum to 1.0)\n",
    "\n",
    "| |  $yes$  | $no$ |\n",
    "| --- | --- | --- |\n",
    "| $Sunny$ |   0.10  | 0.90 |\n",
    "| $Rainy$ | 0.80 | 0.20 |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Looks good so far!\n"
     ]
    }
   ],
   "source": [
    "# create the HMM model\n",
    "model = HiddenMarkovModel(name=\"Example Model\")\n",
    "\n",
    "# emission probability distributions, P(umbrella | weather)\n",
    "sunny_emissions = DiscreteDistribution({\"yes\": 0.1, \"no\": 0.9})\n",
    "sunny_state = State(sunny_emissions, name=\"Sunny\")\n",
    "\n",
    "# TODO: create a discrete distribution for the rainy emissions from the probability table\n",
    "# above & use that distribution to create a state named Rainy\n",
    "rainy_emissions = DiscreteDistribution({\"yes\": 0.8, \"no\": 0.2})\n",
    "rainy_state = State(rainy_emissions, name=\"Rainy\")\n",
    "\n",
    "# add the states to the model\n",
    "model.add_states(sunny_state, rainy_state)\n",
    "\n",
    "assert rainy_emissions.probability(\"yes\") == 0.8, \"The director brings his umbrella with probability 0.8 on rainy days\"\n",
    "print(\"Looks good so far!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **IMPLEMENTATION:** Adding Transitions\n",
    "Once the states are added to the model, we can build up the desired topology of individual state transitions.\n",
    "\n",
    "#### Initial Probability $P(X_0)$:\n",
    "We will assume that we don't know anything useful about the likelihood of a sequence starting in either state. If the sequences start each week on Monday and end each week on Friday (so each week is a new sequence), then this assumption means that it's equally likely that the weather on a Monday may be Rainy or Sunny. We can assign equal probability to each starting state by setting $P(X_0=Rainy) = 0.5$ and $P(X_0=Sunny)=0.5$:\n",
    "\n",
    "| $Sunny$ | $Rainy$ |\n",
    "| --- | ---\n",
    "| 0.5 | 0.5 |\n",
    "\n",
    "#### State transition probabilities $P(X_{t} | X_{t-1})$\n",
    "Finally, we will assume for this example that we can estimate transition probabilities from something like historical weather data for the area. In real problems you can often use the structure of the problem (like a language grammar) to impose restrictions on the transition probabilities, then re-estimate the parameters with the same training data used to estimate the emission probabilities. Under this assumption, we get the conditional probability table below. (Note that the rows sum to 1.0)\n",
    "\n",
    "| | $Sunny$ | $Rainy$ |\n",
    "| --- | --- | --- |\n",
    "|$Sunny$| 0.80 | 0.20 |\n",
    "|$Rainy$| 0.40 | 0.60 |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Great! You've finished the model.\n"
     ]
    }
   ],
   "source": [
    "# create edges for each possible state transition in the model\n",
    "# equal probability of a sequence starting on either a rainy or sunny day\n",
    "model.add_transition(model.start, sunny_state, 0.5)\n",
    "model.add_transition(model.start, rainy_state, 0.5)\n",
    "\n",
    "# add sunny day transitions (we already know estimates of these probabilities\n",
    "# from the problem statement)\n",
    "model.add_transition(sunny_state, sunny_state, 0.8)  # 80% sunny->sunny\n",
    "model.add_transition(sunny_state, rainy_state, 0.2)  # 20% sunny->rainy\n",
    "\n",
    "# TODO: add rainy day transitions using the probabilities specified in the transition table\n",
    "model.add_transition(rainy_state, sunny_state, 0.4)  # 40% rainy->sunny\n",
    "model.add_transition(rainy_state, rainy_state, 0.6)  # 60% rainy->rainy\n",
    "\n",
    "# finally, call the .bake() method to finalize the model\n",
    "model.bake()\n",
    "\n",
    "assert model.edge_count() == 6, \"There should be two edges from model.start, two from Rainy, and two from Sunny\"\n",
    "assert model.node_count() == 4, \"The states should include model.start, model.end, Rainy, and Sunny\"\n",
    "print(\"Great! You've finished the model.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Visualize the Network\n",
    "---\n",
    "We have provided a helper function called `show_model()` that generates a PNG image from a Pomegranate HMM network. You can specify an optional filename to save the file to disk. Setting the \"show_ends\" argument True will add the model start & end states that are included in every Pomegranate network."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAATwAAAB6CAYAAAAic+/DAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzt3XlUU9faB+DfSUIYwpQwCJVJUKkiUEeU2moVtSIO2FontIO3ttahq11fa0eH1um2t97a1qGjtV5t0dZWrBXRWi2o2IoigyhWkUHFAiEhDAGSvN8f3OSCEAwScoDsZ62zlJxhv0l23uyTs8/eHBGBYRjGGgj4DoBhGMZSWMJjGMZqsITHMIzVYAmPYRirwRIewzBWgyU8hmGsBkt4DMNYDZbwGIaxGizhMQxjNUQWLo/d1sEwTEfgTNmItfAYhrEaLOExDGM1LH1K223V1NSgsLAQKpUKKpUKGo0GFRUVAACtVgupVAoHBwfY29vDxcUFEokEHh4eEAjYdw7TPvX19bh16xZKSkpQU1MDAFCr1aiurkZtbS2cnJzg4OAAiUQCV1dX2Nvbw83NDba2tjxHbnks4bVRSUkJ0tLScO7cOZw/fx65ubm4ceMGysrK2nwsW1tb9O3b17AEBwcjMjISffr06YDIma6uuroa6enpSEtLQ1paGi5fvoyCggIUFxdDp9O16VhCoRD+/v4IDg7G/fffj759++KBBx7A0KFDIRQKO+gZ8I+z8PBQXe6ihU6nw9mzZ5GQkICDBw8iPT0dAODn54dBgwahX79+6NmzJ/z8/ODj4wMXFxc4OTlBJBLByckJQEPlUigUqK6uRk1NDZRKJaqqqpCXl4fc3Nwmi1qtRq9evTB+/HiMHz8eUVFRcHZ25vMlYHh0/fp1JCQkICEhAb///jvq6+shlUoxePBghISEwN/fHz4+PvD19YWnpyfs7e0BAHZ2drC3t4ednR0qKipQXV2N6upqQz0sLi5Gbm4uLl26hMuXLyM3NxcKhQIymQxRUVEAgAkTJmDChAno2bMnny+BqdhFC4ZhmMZYC68FcrkcX331FQBg27ZtuHr1KgICAhATE4OJEydi2LBhcHd3N3u5Go0GqampSEpKQlJSEs6ePQtbW1s89thjAIBFixZhxIgRZi+X6Vzq6+uxb98+bN68GcnJyXB1dcWjjz6KmJgYjBgxAoGBgR1S7qVLl5CYmIjDhw8DAE6cOIHa2lqMGTMGCxcuxPTp0zvz6a5JLTwQkSWXTk0ul9PLL79MdnZ25OLiQi4uLrRkyRJKT0/nJZ6ysjLaunUrDRkyhIYMGUIAaPTo0ZScnMxLPEzH0Wg0tGXLFtqyZQt5e3uTUCik2NhYSkxMpLq6Ol5iqqmpof3799PkyZNJKBRS7969afv27aTVanmJ5y5MykEs4RGRTqejLVu2kJubG3l4eNBHH31EKpWKVCoV36E18dtvv9GYMWOI4ziaNWsW3b59m++QGDM4fvw4hYaGklgsJrFYTC+99BLl5+fzHVYTf/31Fy1YsIBEIhENGTKEzp49y3dId2IJzxRlZWU0ZcoUEolE9Oqrr5JCoeA7pLtKSEggf39/8vT0pMTERL7DYe6RRqOhFStWkEAgoJiYGMrNzaXc3Fy+w2pVZmYmjR49msRiMb333nuk0+n4DkmPJby7uXLlCvn7+5Ovry+lpKTwHU6bKJVKmjNnDgmFQvroo4/4Dodpo8rKSho/fjzZ2dnRtm3b+A6nTbRaLb3//vtkY2NDcXFxVFtby3dIRCzhte7KlSvk4+NDw4YNo5KSEr7DuWfr168njuPoX//6F9+hMCaqrq6msWPHkqenZ2c8NTRZUlISOTs7U2xsLNXX1/MdDkt4xsjlcgoICKBhw4ZReXk53+G024cffkgcx1F8fDzfoTB3odPpKDY2lmQyGW8Xw8wpJSWFHB0dadGiRXyHwhKeMdOnT6eePXt26ZbdnZYtW0YSiYT++usvvkNhWrF582YSiUTd6kr7vn37iOM42r17N59hsITXkm+//ZaEQiEdP36c71DMqq6ujsLCwmjs2LF8h8IYce3aNbK3t6eVK1fyHYrZLV68mDw8PKi0tJSvEEzKQVbV8Vij0SAkJATDhw/Hjh07+AylQ5w6dQoPPvggkpKSMG7cOL7DYe7w7LPP4vjx48jJyYFI1L1uY6+oqECfPn3wzDPPYP369XyEYFLHY6tKePHx8YiLi8OlS5cQFBTEZyhmk5WVhZKSEsP9vFOnTgXHcUhMTOQ7NKaRGzduIDAwEFu3bsUzzzzDdzgd4p///CfWrVuHW7duwcHBwdLFm5TwutfXzF3s2bMHY8eO7TbJDmhoNaSmphr+lkgk4DgO0dHRCAwMhI+PD3x8fPDggw8CAHr16sVXqFbtxx9/hL29PeLi4vgOxWz27NmDN954A7169ULfvn3Ro0cPVFVV4auvvsIzzzzDR9K7K6tp4anVari5ueHDDz/Es88+y1cYZrdkyRJ89tlnqK+vb7ZOLBaD4zjU1tYiJiYGAHDgwAFLh8igYeQRV1dXxMfH8x2K2ezYsQNPPfUUAEAkEoHjuCb1UCaTAWj4kh02bBg++eSTjhz/kY2W0tjly5dRXV2Nhx56iO9QzCoyMhJarbbFdXV1daitrQUALFiwAAsWLLBkaEwjZ86cwSOPPMJ3GGY1ZMgQw/81Gk2zL125XA65XI60tDT8+OOPsHDjqkVWk/AYhmGs5je8y5cvQyQSddjQOnyJjIxsdbRbgUCAoKAgTJkyxYJRMY2Vl5dDqVSid+/efIdiVv369YO9vb1hWHljBAIBVq1a1SmGlrKaFt7t27fh4eEBsVjMdyhmFRAQAE9Pz1a3Wb16NQQCAZs/gyc3btwAAPj4+PAciXkJBAKEh4cbXc9xHDiOg5eXF55++mkLRmac1XwCiAgcZ9oYgV3NQw891OK3J8dx8PHxwYwZM3iIirEGI0aMMNqI0Ce8NWvWdJqGhtUkPKFQCI1Gw3cYHWLkyJEttt44jsPKlSu7XSfXrkb/YddfQOpOhgwZ0uLnSv9l6+Pjg3nz5vEQWcus5pPg5+eHkpISqFQqw+Q63UVkZGSL3VI8PT27Vb+vrsrLywscx6GoqKjVU8CuaMiQIS3+hsxxHNauXQsAneoL12paeH379gUR4cqVK3yHYnYDBw5sNseoUCjEm2++2WlOJayZs7MzfHx8kJWVxXcoZtenTx84Ojo2eUwgEMDf3x+zZs3CrFmzeIqsZVaT8IKCguDi4oKTJ0/yHYrZ2djYYNCgQU0ec3FxYf3uOpHw8HCcPXuW7zDMjuO4ZnWPiLB+/XqIRKJO1boDrOiUViQSISoqCocOHcLSpUvNdlyFQgG5XI7y8nLDTO9Aw50djS/XC4VCw/yy+v9LJBLIZDJIpVLY2Ni0K47Ro0cjLS0NdXV1EIlEeO211wxzlDL8e/TRR/Haa69BrVbDzs7OLMfU6XQoKyuDXC5HZWUllEql4fSSiKBQKAA0JCVXV1cAMMyX7OzsDJlMBjc3t3bHMWLECJw5cwa1tbUQCATo3bt3p71QZjW3lgHA9u3bsXjxYty4cQNSqdTodkSE/Px8ADDM7l5YWIj8/HwUFhbi5s2bKC0tRXl5eZtnfDfG0dERMpkM3t7e8PX1ha+vL/z9/eHr62voO9ivX79mp656P//8MyZPngyg4RSqqKio2/1W2ZUVFRXBz88PP//8M6Kjo03aR6lUIjs7G1evXsX169dx/fp1Qx0sLS2FXC5vd1wcx8HNzQ3u7u7w8/NDQEAA/P39ATR0eerduzdCQkIgkUiMHmPv3r2YOXOmoSfE999/j+nTp7c7tjZio6XcSaVSwdfXFytWrMDLL78MAKiqqkJaWhr+/PNPZGdnIysrCzk5OaisrDTs5+TkBD8/P/j5+cHX1xc+Pj5wd3eHVCo1tNBkMhns7OwMN0yLxeImlUSj0UClUgH437dvVVWVoXWovw3n5s2bKCwsRGFhIQoKClBcXGy4JUcoFCIwMBADBgxAv379MGjQIERERMDHxwdlZWXw8PAAEWHVqlVYuXKlpV5WxkSjRo2Ci4sLEhISmjxeWlqKU6dO4Y8//gAAZGZmIiMjA9evXwcA2NraNklGfn5+8PDwgLu7O9zc3ODm5gZHR8dmX+L6vxu39nQ6HZRKJSoqKlBWVobS0lKUlZWhpKQE+fn5hsQKNCTp+vp6CAQCBAYGIjw8HKGhoRg2bBgiIyPh4uICALh+/Tp69eoFjuMQEhKCjIwMPrqAsYTXkn/84x9ITEzE5MmTkZqaiqysLGg0GvTo0QOhoaEICQlBSEgI+vfvD6ChVaW/CZoPdXV1hgstFy9eRHZ2Ni5evIisrCzk5uZCq9WiZ8+eiIiIwPHjx1FdXY2ioiKznKow5rV//37ExsYiJSUF165dw/Hjx3Hy5ElcvnwZHMchODgYABAWFoawsDCEhoYiNDQU/v7+vPQh1Wq1yMvLQ0ZGBjIzM5GZmYn09HRcvXoVAoEAAwYMwMiRIzF69Gg8++yzUCqVOHDggGGgCgtjCQ9oeNN+//13HD58GIcPH8aFCxcgEokwbNgwDBs2DBERERg+fLihGd+VqFQqnD17FqmpqThz5gyOHTsGlUqFHj16YNy4cZgwYQImTpzIkl8nkJ2djZ9++gkbN26EQqGAjY0NIiIiMHLkSERGRiIyMrLVn1k6k7///hunTp1CcnIyTp06hT///BM6nQ6Ojo5YuXIlYmNj+biF0zoTnv43tVOnTmHv3r3Ys2cPiouLERgYiKioKERFRWH8+PGG5nh3c+3aNRw4cAA///wzkpOTodFoMHz4cMOPyLNnz77rrWiMedy8eRN79+7F3r17cfLkSXh4eODRRx/F5MmTu1UdlMvl+P7775GSkoIjR46guLgYgwcPNnQ4tlCdM60JbOpY8GZaOkxeXh699tpr5OXlRV5eXgSAHnjgAVq3bp3VTmyjUqno22+/pWnTppGdnR3Z2dmRWCym2NhYSkpK6kyTKHcrR48epZiYGBIKhSSVSmnhwoX0+++/W8XrrdFo6NChQxQXF0cSiYQkEgmJxWKaOXMmpaamdmTRJuUgq+mHxzAM0+VbeImJiRQTE0MCgYDuu+8+WrlyJa1cuZIuX77cEcV1WUqlkpRKJX3zzTf08MMPEwDq06cPffDBB6RUKvkOr0urq6ujuro62r59O4WFhREAeuSRR+j7778ntVrNd3i8UalUpFKpaMeOHTRkyBACQCNGjKC9e/eSVqs1d3Hdd5pGnU5H+/fvp8GDBxPHcTR27Fj6/vvvO8Ps511GVlYWLV68mJydnUkqldLq1au7xaTklqTT6ei7776jPn36UJ8+fUgsFtO8efPo3LlzfIfWKSUnJ9P06dNJKBRSeHg4HTx40JyH754JLykpiQYOHEgcx9G0adNY5WonhUJBq1evJqlUSq6urvTuu+9STU0N32F1eikpKTRo0CASCAQ0f/58mj9/PuXl5fEdVpeQnZ1N06ZNIwD08MMP04ULF8xx2O6V8IqKiuiJJ54gADRlyhRKT09vz+GYOygUCnrnnXfI0dGRgoKC6JdffuE7pE5JpVLR0qVLSSAQ0KOPPkqZmZl8h9RlnTp1ikaMGEE2Njb09ttvt/f0v/skvG3btpGTkxMFBQWZuxnM3KGwsJBmzJhBAGjGjBkkl8vbdTy1Wk3x8fEUFRVFr732mpmi5Mfp06cpICCA3NzcaOfOnXyH0y1otVratGkTSSQS6t+/P2VnZ9/robp+wlMqlTRz5kwSCAT0+uuvs1MtCzp8+DD17NmT/P396fTp023e//z587R06VJydnYmjuOI4zh6+OGHOyBSy/jss8/I1taWJk2aRLdv3+Y7nG4nLy+PIiMjycnJifbt23cvh+jaCS83N5d69+5NXl5edPTo0Xt5AZh2KikpoejoaLKxsaFt27bddfvy8nL69NNPaejQoQSAxGIxoaGzOQGgoUOHWiBq89LpdLR06VLiOI5WrFjREVcXmf+qra2l559/njiOo3Xr1rV1966b8DIyMsjLy4siIiKouLi4rU+cMSOdTkerV68mjuNow4YNtGHDhibrtVotHTlyhObOnUu2trYkFAqJ47gmiU6/hIaG8vQs7o1Op6MXXniBxGIx7d27l+9wrMbmzZuJ4zhau3ZtW3YzKQd1qvHw0tPTAQBjx45FeHg4EhISmo2mylgWx3FYsWIF3NzcsGzZMgBATU0NFixYgN27d+OTTz5BUVERRCLRXecMqaqqwtGjR42ur6ioMDqpeGP29vZGx5RzdHQ0jC1oa2trGL1GIBAYbuVydnY2acrA//u//8Pnn3+O+Ph4xMbG3nV7xjxeeOEFCIVCLFq0yPA+60c3ajdTM6OZFqNu375Nfn5+5OfnR2PHju2Q3+suXLhAs2bNoqCgILK1tSWZTEajRo2iDRs20KVLl8xeXnfzxRdf0BdffNFi682URSaT3fO+5l44jiOpVEqenp7k6elJgYGBFBISQoMHD6bRo0dTbGwscRxHu3fvNutr2FpMYrGYxGIxhYWFmb3cruiDDz4goVBIQqGQfvvtt7ttblIO6hSDB2g0GowZMwY3b94EAPzxxx9mH5Lp0KFDmDx5MsLCwvDJJ58gPDwcFRUV+OWXX/DSSy9BpVLBwq9FlxUbG2sY000/FZ8pM8L16NEDOTk5Rtc7ODgYHeBUj+h/Y7u1pHErsaamBmq1GkDT8QiVSiU0Gg2USqVh8qPKykrD9kVFRfjhhx+wcOFCbNq06a7P617oh3vS1zmdToeLFy8CAJ588kmcO3cOiYmJmDBhQrvKeeihhwAAycnJ7ToOX/SDXpw8eRLp6emtDULQdQYP+Pjjj8nOzo6ysrIoKyvLpOzfVg888AABoIyMjGbrPvnkE2p4KRhT1NfX06hRo2jgwIEUHx9Pc+bMIXt7e+I4jmxsbIy2YKRSKd+hmyQmJoYGDhxItbW1HVaG/jVpye+//04A6KGHHmp3OZGRkRQZGdnu4/BFoVCQQqEgPz8/eu6551rbtGtctJDL5eTu7t7hfbRsbW0JAFVWVjZbV1hYyBJeG2VnZ5NIJKKvv/6aiIiqq6spISGB4uLiyNHRkQA0S36Ojo48R313KSkpBKDDewa0lvCUSiUBIDc3tw6NoSvZvn07CYVCysnJMbZJ10h47777Lrm7u3f4Dex+fn4EgL755psOLceaPPfccxQYGNhs2KPa2lo6dOgQLVy4kNzc3AgACQQCEovFPEVquilTptDo0aM7vBxTEp6rq2uHx9FVaDQa6tevHz3//PPGNjEpB7HhoRiGsR6mZkYzLc2EhITQsmXL7i3tt8Grr75KAEgoFNL8+fPp2LFjpNFojG6PO35/MrbO2OMFBQU0ZcoUmjJlCjk6OpKnpyfNnTuXSktL77qPqdvrl2+//daw3t/fv9XWgzmdP3+eANCZM2eMbqPVaik5OZlefvllmjVrVofH1B4qlYrs7OwMp+kdqbX36MSJEwSAoqKimq07cuQITZ48mVxdXcnW1pYGDhzY5P1vqQxT6umdda5xvessdW7t2rXUo0cPY52/O/8pbU5ODgGglJSU9r0SJqiqqqI5c+Y0edNcXV1p1qxZdODAAaOj0Rp7I+/2+Ny5c+nixYt08eJFUigUtGjRIgJATz311F33udv2R48eJQDk7e1NdXV1TdZ9/vnnNGnSJFNflnYLDg6mV155xWLldaQffviBRCJRsy+ZjnBn/dFqtZSZmUmZmZk0aNAgkslkdPbs2Rb3mzZtGpWUlFB+fj6NGzeOAFBiYqJJ5dz5uLE6d2e96wx1Ljs7mwAYu9Wx8ye8PXv2kFAotOg4dhkZGfTKK69QcHBwk+Q3YsQI+vvvv5ttf68J7/jx400ez8vLIwB03333mbRPa9sTEYWHhxMA2rFjR5PHQ0ND6ciRIy0/+Q7w9NNP08SJEy1WXkd69913KTg42CJltdRq0i9z5syhmzdvGt2v8TBU+kaDsSu6bamnjetcS/WuM9Q5Jycn+vLLL1ta1fkT3saNG8nHx6d9r0A7XLlyhd5++23DVcUnn3yy2Tb3mvAqKiqaPF5bW0tAQ4dXU/ZpbXsioq+//pqAhnk79H799VcKCQkx+nw7wooVK2jAgAEWLbOjLFq0iB555BGLlNW4/uh0Orpw4YKh4z3HccY+1M1oNJpWr+i2pZ42rnMt1bvOUOeCg4PpnXfeaWmVSTmI14sWSqUSzs7OvJXfu3dvvPPOO9i7dy8AIDEx0WzHdnJyavK3WCwG0PAFY8o+d9t+9uzZ8Pb2Rnp6Oo4dOwYA2LRpE1588cV2xd1WLi4urXYE7koqKiqavW+WwHEcwsLCsGXLFmzZsgVEhFdffdXQUVpPoVDgjTfeQL9+/eDk5ASO4yASNdwdWlZWdk9lG6tzLdW7zlDnXFxcoFQq73l/XhOet7c3bt26ZZGyBAIBbt++3eI6fW/0ioqKZuv0PeL1PfIBtOsFNxexWIwlS5YAADZu3Ihr167h9OnTiIuLs2gcN2/eRM+ePS1aZkfx8vIyWkcsYdKkSZg0aRJGjhyJsrIy/Pvf/26y/oknnsD69esxc+ZM5OfnG01MHaUz1Llbt27B29v73g9galPQTEsTBw8eJACkUCjusYFrOgD06aeftrju+PHjBIAefPDBZuu8vb0JAOXn5xse++2339p8qtvauns5FhFRWVkZOTg4EMdxNGnSJHr99deNbttRHn/8cXrssccsXm5H2LhxI3l7e1ukrNbeW319dHFxaTIAq4ODQ7PTULVa3SF1ztjx+KxzdXV1ZGNjY+yqdOf/DU8ul5NYLLbIjdJAQ0//Dz74gPLy8kitVtOtW7do165d5OPjQ/b29i1eLZ4/fz4BoCVLlpBCoaCcnByKi4vrFAmPiAxX1kQiERUVFbW6rbmp1WpycXGhTZs2WbRcU9XX17dpsM7Tp08TgA67vbGxu723UVFRBKBJQpkwYYLhsfLyciorK6OXX37ZogmPiL86d/jw4WYXbRrp/AmPiGjixIk0derUe34RTHXhwgVasWIFjRo1ijw9PUkkEpGtrS317t2bFixYQBcvXmxxv5KSEpozZw55eHiQRCKhyZMnU0FBQYsVo/Fjd1YYY+tMfdxYBczNzSWBQMBLH7f9+/eTQCCgGzduWLxsU2zZsoU4jqPw8HBas2bNXROZVqslb29vWrNmTYfF1NL72tJ7m5qa2mT9+vXr6fbt2zRv3jzy9PQksVhMAwYMoPj4eKupcy+88AINGjTI2OqukfC+++47EgqF5pq5yOroP6T3Mgx7e+h0Oho+fDhNmDDBouW2xUcffURCodDQGgFAvr6+9Morr1BKSkqLHVhfeukl8vHxoerqah4i7hr4qHN///03OTk50fvvv29sk66R8HQ6HQ0bNozGjBlzb6+ElUtISKCIiAiLl7tz585O/0X12WefGRJd40U/qIGLiwvFxcVRXFwc7dmzhyorK035YFk9Purciy++SN7e3i0O/vFfXSPhETWMUMFxHG3bts2kuROsHdDQ21wul9PgwYNp//79Fi2/oKCAPD09aeHChRYtt62++eYbEggERk8j9clPnwDFYjFFR0dTdHQ0OTs709WrV/l+Cp0GX3Xujz/+oD/++IPEYjFt3bq11RBNWTrFAKAAsGrVKqxbtw4AcOTIEYwaNcpiQXU1+q4ybm5uWLJkCVatWmWRcvWDaQ4fPhyXL19GSEgIpFIpOI6Dq6srAMDV1RUcxxmGUW+8Tk8/1LpA8L9eURKJxNAP7G4aD+bZWG1tLaqrqw1/Z2ZmYteuXW16jgKBADqdDhzHISgoCBkZGbC3t2/TMbojPuqcQqHA4MGDAQABAQFISkpqbWh+kwYA7TQJj4gwffp0AA2jmx45cgTh4eEWC4xpXW1tLWbOnAkAOHHiBGbOnAmJRILKykpotVpUVFSA6H+jESsUChBRs+TUeJvGTJ3PQk8qlTZ7TCQSNelIW11djeLiYpOPqZ+Xw8vLCzNmzMDXX3+NKVOmYMeOHSbNgcGYj1qtxrRp05CVlQUAOH/+PDw8PFrbxaSEx4aHYhjGeph67mumpVUqlYpUKhWNGTOGpFIpnTp16m67MBZQWVlJUVFR5OrqSq6urnTy5Em+QzKJfoQPYwvHcSQSiQzLxIkTac+ePYbBLJKSksje3p7mz5/P5qO1ILVaTRMnTiSpVEppaWmUlpZmym5d56LFnWpqamjq1KkkkUiMjvXFWEZeXh4NGTKEevToQefPn6fz58/zHZLJTp48afRCBQAKCgoyzLVrrIPy4cOHyc7OjmbOnElVVVUWfgbWp6yszPDl+ueff7Zl166b8IgaeskvW7aMOI6jhQsXsn5RPNi3bx+5urpSWFgYXblyhe9w2uzcuXOGJCcQCEggEJC9vT09/fTTbTp7OHbsGLm5uVF4eDhdu3atAyO2bhcuXKDAwEDy8/Ojc+fOtXX3rp3w9Pbt20dSqZT69+9PJ06cuJdDMG30999/09NPP00A6LnnnuuyXzb6seIA0NChQ+nLL78klUp1T8e6du0aPfDAAySTyWjnzp1mjtS6abVa+vjjj0kikdDo0aNbHJfSBN0j4RE1nFZFR0cTx3E0b948unXr1r0eimmFVqulrVu3kkwmo549e9IPP/zAd0jtUl9fT5s3b6bs7GyzHK+qqopeeOEFEggEFB0dTQUFBWY5rjXLycmhBx98kGxsbOitt95qz2DA3Sfh6SUkJFBAQAA5OjrSsmXLWOIzE61WSwkJCTRw4EASiUS0bNmyDp9Fris7efIk3X///SSRSGj58uVUXl5O5eXlfIfVpZSWltLy5cvJzs6OwsPDWxzOvo26X8IjarhiuGHDBvLw8CBHR0davnx5p715vbOrq6uj3bt3U0hIiOFmcGODKDBN1dTU0IYNG0gqlZK7uzu5u7vTBx980NqtTww1jJC0atUqcnJyIm9vb9qyZYu5pnjonglPr7Kykt577z3y9PQkGxsbevzxx+nXX38lnU5ndEIepkF+fj699dZb5O3tTUKhkGbPnm220z5rI5fLafny5bR8+XJycHAgqVRKy5cvp8LCQr7GYPvXAAAFdklEQVRD61Ryc3Np8eLFJJFIyNXVldauXWvuL4funfD01Go17dy5kyIjIwkABQcHU3BwMK1ataq1WcqtikKhoB07dtCOHTsoOjqahEIheXl50ZtvvtlkYFOmfUpLS2nt2rV03333kY2NDT322GP0008/UW1tLd+hWVxVVRXt3r2bdu/eTRMnTiSBQEBBQUH04YcfNpvvxUysI+E1lp6eTkuXLqWlS5caRioOCwujNWvW0Llz56yq5VdcXEw7duygqVOnkq2tLYnFYhKLxRQTE0Px8fFW+SG0lNraWvrPf/5DY8aMIYFAQG5ubrR48WI6duxYsykOu5Pq6mr65Zdf6KmnniInJyfDwAwxMTH0008/dXTnbZNyUKe5l9bcdDodkpOTER8fjx9//BHFxcXw9PTEuHHjMG7cODzyyCPw8/OzVDgdTqVSITU1FUeOHEFSUhIyMjIgFosxZswYPPHEE5g2bRoANLuRn+lYhYWF2LVrF3bt2oWsrCy4uLhg/PjxmDRpEqKiorr8fCDXrl1DUlISDh48iGPHjqGmpgZDhw7F3LlzMXv2bAC42z2w5tK1Bg/o0EKJcOHCBSQlJSEpKQkpKSmora2Ft7c3IiIiMHz4cERERCAsLAwymYyPENukrq4Oly5dQlpaGlJTU3H69GlcvHgRWq0W999/P8aPH48JEyZg1KhRkEgkfIfL/Ne1a9dw8OBBHDx4ECdOnIBarUavXr0wcuRIjBw5EsOHDwcA9OvXDzY2NjxH25xarUZ2djZOnTqFlJQUpKSk4ObNm5BIJBg3bhyio6MxadIk3HfffXyExxKeMdXV1Th79ixSU1MNi372tB49eiAkJAT9+/cH0FD5/Pz8EBAQAF9fX8PQRh2prq4ORUVFKCwsRH5+PnJzcwEAOTk5yM7OxtWrV6HRaGBvb4/BgwcjIiICI0aMwPDhw7t8i8FaVFVV4cyZM0hOTsbJkydx+vRpVFZWAgBsbGxw//33IzQ0FKGhoejTpw8CAgLg7+8Pd3f3Do3r9u3buH79OgDg+vXruHLlCjIyMpCZmYkrV65Aq9VCKpUiMjLSkKiHDh0KW1vbDo3LBCzhtUVhYSEuXryI7OxsQ2IBgMuXL0Mulxu2c3Z2hq+vL9zc3CCTySCTySCVSiGTySAWiw3z7AqFwiZz7jYeq02tVqOmpgYVFRUoLy+HXC43LLdu3cKtW7egf19sbW0RGBgIAAgJCUG/fv0M/3bWlgDTdhqNBjk5OQAaxvHTJ5msrCwUFhYa6oOjoyMCAgLg7u4OmUwGd3d3uLu7w9nZGQ4ODk0Sj35Mwvr6ekMyrampgVqtRnl5OUpLS1FWVga5XI7S0lLk5eWhpqbGsL9QKIS/vz/CwsIwYMAAhIWFITQ0FH379m0ylmEnwYaHYhiGaYy18ExQWVmJgoIC5Ofno7CwEEVFRZDL5YbWmf5fjUZjGNzyzoEvbW1t4eDgAACGb2JnZ2dD61D/r7e3N3x9feHr6ws/Pz94eXnx8pyZzqO2thYFBQW4fv068vPzUVBQgLKyMpSVlRlaaRUVFYbWm55SqYROp2syMKq+7kmlUri5uRnOVNzd3eHv74+AgAAEBAQAAHx8fLrSGQQ7pWUYxmqwU1qGYZjGWMJjGMZqsITHMIzVEFm4vL0WLo9hGMbA0hctGIZheMNOaRmGsRos4TEMYzVYwmMYxmqwhMcwjNVgCY9hGKvBEh7DMFaDJTyGYawGS3gMw1gNlvAYhrEaLOExDGM1WMJjGMZqsITHMIzVYAmPYRirwRIewzBWgyU8hmGsBkt4DMNYDZbwGIaxGizhMQxjNVjCYxjGarCExzCM1WAJj2EYq8ESHsMwVoMlPIZhrMb/A/denqxh0nfVAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x7fc0b8471550>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_model(model, figsize=(5, 5), filename=\"example.png\", overwrite=True, show_ends=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Checking the Model\n",
    "The states of the model can be accessed using array syntax on the `HMM.states` attribute, and the transition matrix can be accessed by calling `HMM.dense_transition_matrix()`. Element $(i, j)$ encodes the probability of transitioning from state $i$ to state $j$. For example, with the default column order specified, element $(2, 1)$ gives the probability of transitioning from \"Rainy\" to \"Sunny\", which we specified as 0.4.\n",
    "\n",
    "Run the next cell to inspect the full state transition matrix, then read the . "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The state transition matrix, P(Xt|Xt-1):\n",
      "\n",
      "[[ 0.   0.5  0.5  0. ]\n",
      " [ 0.   0.8  0.2  0. ]\n",
      " [ 0.   0.4  0.6  0. ]\n",
      " [ 0.   0.   0.   0. ]]\n",
      "\n",
      "The transition probability from Rainy to Sunny is 40%\n"
     ]
    }
   ],
   "source": [
    "column_order = [\"Example Model-start\", \"Sunny\", \"Rainy\", \"Example Model-end\"]  # Override the Pomegranate default order\n",
    "column_names = [s.name for s in model.states]\n",
    "order_index = [column_names.index(c) for c in column_order]\n",
    "\n",
    "# re-order the rows/columns to match the specified column order\n",
    "transitions = model.dense_transition_matrix()[:, order_index][order_index, :]\n",
    "print(\"The state transition matrix, P(Xt|Xt-1):\\n\")\n",
    "print(transitions)\n",
    "print(\"\\nThe transition probability from Rainy to Sunny is {:.0f}%\".format(100 * transitions[2, 1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Inference in Hidden Markov Models\n",
    "---\n",
    "Before moving on, we'll use this simple network to quickly go over the Pomegranate API to perform the three most common HMM tasks:\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Likelihood Evaluation**<br>\n",
    "Given a model $\\lambda=(A,B)$ and a set of observations $Y$, determine $P(Y|\\lambda)$, the likelihood of observing that sequence from the model\n",
    "</div>\n",
    "\n",
    "We can use the weather prediction model to evaluate the likelihood of the sequence [yes, yes, yes, yes, yes] (or any other state sequence). The likelihood is often used in problems like machine translation to weight interpretations in conjunction with a statistical language model.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Hidden State Decoding**<br>\n",
    "Given a model $\\lambda=(A,B)$ and a set of observations $Y$, determine $Q$, the most likely sequence of hidden states in the model to produce the observations\n",
    "</div>\n",
    "\n",
    "We can use the weather prediction model to determine the most likely sequence of Rainy/Sunny states for a known observation sequence, like [yes, no] -> [Rainy, Sunny]. We will use decoding in the part of speech tagger to determine the tag for each word of a sentence. The decoding can be further split into \"smoothing\" when we want to calculate past states, \"filtering\" when we want to calculate the current state, or \"prediction\" if we want to calculate future states. \n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Parameter Learning**<br>\n",
    "Given a model topography (set of states and connections) and a set of observations $Y$, learn the transition probabilities $A$ and emission probabilities $B$ of the model, $\\lambda=(A,B)$\n",
    "</div>\n",
    "\n",
    "We don't need to learn the model parameters for the weather problem or POS tagging, but it is supported by Pomegranate.\n",
    "\n",
    "### IMPLEMENTATION: Calculate Sequence Likelihood\n",
    "\n",
    "Calculating the likelihood of an observation sequence from an HMM network is performed with the [forward algorithm](https://en.wikipedia.org/wiki/Forward_algorithm). Pomegranate provides the the `HMM.forward()` method to calculate the full matrix showing the likelihood of aligning each observation to each state in the HMM, and the `HMM.log_probability()` method to calculate the cumulative likelihood over all possible hidden state paths that the specified model generated the observation sequence.\n",
    "\n",
    "Fill in the code in the next section with a sample observation sequence and then use the `forward()` and `log_probability()` methods to evaluate the sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "            Rainy      Sunny      Example Model-start      Example Model-end   \n",
      " <start>      0%         0%               100%                     0%          \n",
      "   yes       40%         5%                0%                      0%          \n",
      "    no        5%        18%                0%                      0%          \n",
      "   yes        5%         2%                0%                      0%          \n",
      "\n",
      "The likelihood over all possible paths of this model producing the sequence ['yes', 'no', 'yes'] is 6.92%\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# TODO: input a sequence of 'yes'/'no' values in the list below for testing\n",
    "observations = ['yes', 'no', 'yes']\n",
    "\n",
    "assert len(observations) > 0, \"You need to choose a sequence of 'yes'/'no' observations to test\"\n",
    "\n",
    "# TODO: use model.forward() to calculate the forward matrix of the observed sequence,\n",
    "# and then use np.exp() to convert from log-likelihood to likelihood\n",
    "forward_matrix = np.exp(model.forward(observations))\n",
    "\n",
    "# TODO: use model.log_probability() to calculate the all-paths likelihood of the\n",
    "# observed sequence and then use np.exp() to convert log-likelihood to likelihood\n",
    "probability_percentage = np.exp(model.log_probability(observations))\n",
    "\n",
    "# Display the forward probabilities\n",
    "print(\"         \" + \"\".join(s.name.center(len(s.name)+6) for s in model.states))\n",
    "for i in range(len(observations) + 1):\n",
    "    print(\" <start> \" if i==0 else observations[i - 1].center(9), end=\"\")\n",
    "    print(\"\".join(\"{:.0f}%\".format(100 * forward_matrix[i, j]).center(len(s.name) + 6)\n",
    "                  for j, s in enumerate(model.states)))\n",
    "\n",
    "print(\"\\nThe likelihood over all possible paths \" + \\\n",
    "      \"of this model producing the sequence {} is {:.2f}%\\n\\n\"\n",
    "      .format(observations, 100 * probability_percentage))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### IMPLEMENTATION: Decoding the Most Likely Hidden State Sequence\n",
    "\n",
    "The [Viterbi algorithm](https://en.wikipedia.org/wiki/Viterbi_algorithm) calculates the single path with the highest likelihood to produce a specific observation sequence. Pomegranate provides the `HMM.viterbi()` method to calculate both the hidden state sequence and the corresponding likelihood of the viterbi path.\n",
    "\n",
    "This is called \"decoding\" because we use the observation sequence to decode the corresponding hidden state sequence. In the part of speech tagging problem, the hidden states map to parts of speech and the observations map to sentences. Given a sentence, Viterbi decoding finds the most likely sequence of part of speech tags corresponding to the sentence.\n",
    "\n",
    "Fill in the code in the next section with the same sample observation sequence you used above, and then use the `model.viterbi()` method to calculate the likelihood and most likely state sequence. Compare the Viterbi likelihood against the forward algorithm likelihood for the observation sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The most likely weather sequence to have generated these observations is ['Rainy', 'Sunny', 'Rainy'] at 2.30%.\n"
     ]
    }
   ],
   "source": [
    "# TODO: input a sequence of 'yes'/'no' values in the list below for testing\n",
    "observations = ['yes', 'no', 'yes']\n",
    "\n",
    "# TODO: use model.viterbi to find the sequence likelihood & the most likely path\n",
    "viterbi_likelihood, viterbi_path = model.viterbi(observations)\n",
    "\n",
    "print(\"The most likely weather sequence to have generated \" + \\\n",
    "      \"these observations is {} at {:.2f}%.\"\n",
    "      .format([s[1].name for s in viterbi_path[1:]], np.exp(viterbi_likelihood)*100)\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Forward likelihood vs Viterbi likelihood\n",
    "Run the cells below to see the likelihood of each sequence of observations with length 3, and compare with the viterbi path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The likelihood of observing ['no', 'no', 'yes'] if the weather sequence is...\n",
      "\t('Sunny', 'Sunny', 'Sunny') is 2.59% \n",
      "\t('Sunny', 'Sunny', 'Rainy') is 5.18%  <-- Viterbi path\n",
      "\t('Sunny', 'Rainy', 'Sunny') is 0.07% \n",
      "\t('Sunny', 'Rainy', 'Rainy') is 0.86% \n",
      "\t('Rainy', 'Sunny', 'Sunny') is 0.29% \n",
      "\t('Rainy', 'Sunny', 'Rainy') is 0.58% \n",
      "\t('Rainy', 'Rainy', 'Sunny') is 0.05% \n",
      "\t('Rainy', 'Rainy', 'Rainy') is 0.58% \n",
      "\n",
      "The total likelihood of observing ['no', 'no', 'yes'] over all possible paths is 10.20%\n"
     ]
    }
   ],
   "source": [
    "from itertools import product\n",
    "\n",
    "observations = ['no', 'no', 'yes']\n",
    "\n",
    "p = {'Sunny': {'Sunny': np.log(.8), 'Rainy': np.log(.2)}, 'Rainy': {'Sunny': np.log(.4), 'Rainy': np.log(.6)}}\n",
    "e = {'Sunny': {'yes': np.log(.1), 'no': np.log(.9)}, 'Rainy':{'yes':np.log(.8), 'no':np.log(.2)}}\n",
    "o = observations\n",
    "k = []\n",
    "vprob = np.exp(model.viterbi(o)[0])\n",
    "print(\"The likelihood of observing {} if the weather sequence is...\".format(o))\n",
    "for s in product(*[['Sunny', 'Rainy']]*3):\n",
    "    k.append(np.exp(np.log(.5)+e[s[0]][o[0]] + p[s[0]][s[1]] + e[s[1]][o[1]] + p[s[1]][s[2]] + e[s[2]][o[2]]))\n",
    "    print(\"\\t{} is {:.2f}% {}\".format(s, 100 * k[-1], \" <-- Viterbi path\" if k[-1] == vprob else \"\"))\n",
    "print(\"\\nThe total likelihood of observing {} over all possible paths is {:.2f}%\".format(o, 100*sum(k)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Congratulations!\n",
    "You've now finished the HMM warmup. You should have all the tools you need to complete the part of speech tagger project."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
